Design and Construction of Knowledge base for Verb using MRD and Tagged Corpus
Abstract
This paper presents a procedure for building a syntactic knowledge base for verbs. The study constructs basic sentence patterns automatically from the POS-tagged portion of the balanced KAIST corpus and an electronic dictionary of Korean, and then builds the syntactic knowledge base by adding more specific information through lexicographers' analysis. The work process is summarized as follows:
1) Extraction of characteristic verbs, targeting the high-frequency verbs in the KAIST corpus
2) Construction of sentence patterns from the case frame structure of each verb, as extracted from the MRD
3) Determination of the noun categories of each sentence pattern through KCP examples
4) Semantic classification of the selected verbs according to the classified sentence patterns
5) Description of the hyper (superordinate) concept for each noun category
6) Addition of Japanese translations for each noun and verb
* This paper has been supported by AITrc and the Korean Ministry of Science and Technology.
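Step 1 of the work process above (selecting high-frequency verbs from a POS-tagged corpus) could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `high_frequency_verbs`, the `(word, tag)` pair representation, the tag label `"V"`, and the toy sentences are all assumptions made for the example and do not reflect the actual KAIST corpus format or tagset.

```python
from collections import Counter


def high_frequency_verbs(tagged_sentences, verb_tag="V", top_n=2):
    """Count tokens carrying the verb tag and return the most frequent ones.

    tagged_sentences: iterable of sentences, each a list of (word, tag) pairs.
    verb_tag: hypothetical tag label for verbs (not the real KAIST tagset).
    """
    counts = Counter(
        word
        for sentence in tagged_sentences
        for word, tag in sentence
        if tag == verb_tag
    )
    return counts.most_common(top_n)


# Toy corpus invented for illustration; it is not KAIST corpus data.
toy_corpus = [
    [("나", "N"), ("밥", "N"), ("먹다", "V")],
    [("그", "N"), ("학교", "N"), ("가다", "V")],
    [("아이", "N"), ("사과", "N"), ("먹다", "V")],
]

print(high_frequency_verbs(toy_corpus, top_n=1))
```

In a real pipeline the verbs selected this way would then be looked up in the MRD to obtain their case frames (step 2); that lookup depends on the dictionary's format and is not shown here.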
Related resources
Construction of a Knowledge Base of Conceptual Graphs
The goal of this project was to develop a piece of software to build a Lexical Knowledge Base (LKB) automatically by extracting information from a Machine Readable Dictionary (MRD). The extracted knowledge is represented using the Conceptual Graph (CG) formalism to give explicit information about noun and verb definitions. Sentences from the MRD are tagged, parsed and translated into conceptual...
Using Corpus Statistics and WordNet Relations for Sense Identification
Corpus-based approaches to word sense identification have flexibility and generality but suffer from a knowledge acquisition bottleneck. We show how knowledge-based techniques can be used to open the bottleneck by automatically locating training corpora. We describe a statistical classifier that combines topical context with local cues to identify a word sense. The classifier is used to disambig...
PAYMA: A Tagged Corpus of Persian Named Entities
The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...
Corpus based coreference resolution for Farsi text
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
Dual Distributional Verb Sense Disambiguation With Small Corpora And Machine Readable Dictionaries
This paper presents a system for unsupervised verb sense disambiguation in Korean using a small corpus and a machine-readable dictionary (MRD). The system learns a set of typical usages listed in the MRD usage examples for each of the senses of a polysemous verb in the MRD definitions using verb-object co-occurrences acquired from the corpus. This paper concentrates on the problem of data sparsen...
Publication date: 2000